Exploration of Projection Spaces¶

In [289]:
!pip install altair
!pip install openTSNE
!pip install umap-learn
!pip install minisom
In [290]:
# Feel free to add dependencies, but make sure that they are included in environment.yml

#disable some annoying warnings
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

#plots the figures in place instead of a new window
%matplotlib inline

import numpy as np
import pandas as pd
import umap

import matplotlib.pyplot as plt
import altair as alt
from altair import datum
alt.data_transformers.disable_max_rows()

from sklearn import manifold
from sklearn.manifold import MDS
from openTSNE import TSNE
from sklearn.decomposition import PCA
from minisom import MiniSom
In [291]:
tic_tac_toe = pd.read_csv('tic_tac_toe_games_upd_1000.csv')
tic_tac_toe.head()
Out[291]:
pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22 player step result game_id state
0 0 0 0 0 0 0 0 0 1 1 0 1 1 start
1 0 0 0 0 -1 0 0 0 1 -1 1 1 1 mid
2 1 0 0 0 -1 0 0 0 1 1 2 1 1 mid
3 1 0 -1 0 -1 0 0 0 1 -1 3 1 1 mid
4 1 0 -1 0 -1 1 0 0 1 1 4 1 1 mid

Data¶

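The dataset contains 1000 simulated tic-tac-toe games, stored as one row per board state (7616 rows in total). Each `pos` column holds 1 for an X mark, -1 for an O mark, and 0 for an empty square.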

Read and Prepare Data¶

Comments¶

  • Did you transform, clean, or extend the data? How/Why?

TODO

Projection¶

Project your data into a 2D space. Try multiple (3+) projection methods (e.g., t-SNE, UMAP, MDS, PCA, ICA, other methods) with different settings and compare them.

Make sure that all additional dependencies are included when submitting.
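One possible way to compare projections beyond visual inspection is scikit-learn's `trustworthiness` score, which measures how well local neighbourhoods of the original space are preserved in the embedding (values lie between 0 and 1, higher is better). The sketch below uses randomly generated stand-in boards, not the tic-tac-toe dataset, and PCA as the example projection; both choices are assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

# Random stand-in data in the same -1/0/1 encoding (NOT the real game states)
rng = np.random.default_rng(0)
boards = rng.integers(-1, 2, size=(200, 9)).astype(float)

# Any 2D projection could be scored here; PCA is used as a simple example
coords = PCA(n_components=2, random_state=0).fit_transform(boards)

# Fraction-like score in [0, 1]: how well 5-nearest-neighbour sets survive the projection
score = trustworthiness(boards, coords, n_neighbors=5)
print(f"trustworthiness: {score:.3f}")
```

Computing the same score for each method and hyperparameter setting would give one number per projection that can be listed next to the charts.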

In [292]:
meta_data = tic_tac_toe.iloc[:, 9:]
data = tic_tac_toe.iloc[:, :9]

data.head()
Out[292]:
pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22
0 0 0 0 0 0 0 0 0 1
1 0 0 0 0 -1 0 0 0 1
2 1 0 0 0 -1 0 0 0 1
3 1 0 -1 0 -1 0 0 0 1
4 1 0 -1 0 -1 1 0 0 1
In [293]:
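# Note: pd.get_dummies only expands object/categorical columns by default.
# These integer columns pass through unchanged (see output), so the
# projections below operate on the raw -1/0/1 encoding, not a one-hot encoding.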
one_hot_data = pd.get_dummies(data)
one_hot_data.head()
Out[293]:
pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22
0 0 0 0 0 0 0 0 0 1
1 0 0 0 0 -1 0 0 0 1
2 1 0 0 0 -1 0 0 0 1
3 1 0 -1 0 -1 0 0 0 1
4 1 0 -1 0 -1 1 0 0 1
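If an explicit one-hot encoding is wanted, the columns have to be named in the `columns=` argument of `pd.get_dummies`. A minimal sketch on a small made-up frame (the values are illustrative and not taken from the dataset):

```python
import pandas as pd

# Toy board states in the same -1/0/1 encoding (illustrative values only)
toy = pd.DataFrame({"pos00": [0, 1, -1], "pos01": [0, 0, 1]})

# Default call: integer columns are returned unchanged
unchanged = pd.get_dummies(toy)

# Naming the columns forces each distinct value into its own indicator column
encoded = pd.get_dummies(toy, columns=list(toy.columns), dtype=int)

print(list(unchanged.columns))
print(list(encoded.columns))
```

Whether the one-hot form or the raw -1/0/1 form gives the more useful projection is a modelling choice; the one-hot form treats X, O, and empty as three unordered categories.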
In [294]:
# TSNE
perplexity_values = [5, 150]
for perplexity in perplexity_values:
    tsne_coords = manifold.TSNE(perplexity=perplexity).fit_transform(one_hot_data)
    df_tsne_coords = pd.DataFrame(tsne_coords, columns=['X', 'Y'])

    df_tsne_coords = pd.concat([df_tsne_coords, meta_data.reset_index(drop=True)], axis=1)
    df_tsne_coords = pd.concat([df_tsne_coords, data.reset_index(drop=True)], axis=1)

    chart = alt.Chart(df_tsne_coords).mark_circle(opacity=0.6).encode(
        x='X',
        y='Y',
        color=alt.value("#4c78a8"),
        tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q','pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
    ).properties(
        width=500,
        height=400,
        title=f"t-SNE Projection with Perplexity {perplexity}"
    ).interactive()
    chart.display()
In [295]:
# UMAP
n_neighbors_values = [5, 150]

for n_neighbors in n_neighbors_values:
    umap_coords = umap.UMAP(n_neighbors=n_neighbors, min_dist=0.1).fit_transform(one_hot_data)
    df_umap_coords = pd.DataFrame(umap_coords, columns=['X', 'Y'])

    df_umap_coords = pd.concat([df_umap_coords, meta_data.reset_index(drop=True)], axis=1)
    df_umap_coords = pd.concat([df_umap_coords, data.reset_index(drop=True)], axis=1)

    chart = alt.Chart(df_umap_coords).mark_circle(opacity=0.6).encode(
        x='X',
        y='Y',
        color=alt.value("#4c78a8"),
        tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
    ).properties(
        width=500,
        height=400,
        title=f"UMAP Projection with n_neighbors {n_neighbors}"
    ).interactive()

    chart.display()
In [296]:
# PCA
pca_coords = PCA(n_components=2).fit_transform(one_hot_data)

df_pca_coords = pd.DataFrame(pca_coords, columns=['X', 'Y'])
df_pca_coords = pd.concat([df_pca_coords, meta_data.reset_index(drop=True)], axis=1)
df_pca_coords = pd.concat([df_pca_coords, data.reset_index(drop=True)], axis=1)

pca_chart = alt.Chart(df_pca_coords).mark_circle(opacity=0.6).encode(
    x='X',
    y='Y',
    color=alt.value("#4c78a8"),
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=500,
    height=400,
    title="PCA Projection of Tic-Tac-Toe Game States"
).interactive()

pca_chart.display()
In [297]:
# SOM
som_x = 40  # width of SOM grid
som_y = 40 # height of SOM grid

# Initialize the SOM
som = MiniSom(x=som_x, y=som_y, input_len=one_hot_data.shape[1], sigma=1.0, learning_rate=0.5)
som.random_weights_init(one_hot_data.values)

# Train the SOM
som.train_random(data=one_hot_data.values, num_iteration=1000)

# Map each data point to its best matching unit (BMU) on the SOM grid
som_coords = np.array([som.winner(x) for x in one_hot_data.values])
som_df = pd.DataFrame(som_coords, columns=['X', 'Y'])

# Combine SOM coordinates with metadata
som_df = pd.concat([som_df, meta_data.reset_index(drop=True)], axis=1)

# Visualize the BMU grid coordinates with Altair
chart = alt.Chart(som_df).mark_circle(opacity=0.6).encode(
    x='X:O',  # Ordinal axis for discrete grid coordinates
    y='Y:O',
    color=alt.value("#4c78a8")
).properties(
    width=500,
    height=400,
    title="Self-Organizing Map Projection of Tic-Tac-Toe States"
).interactive()

chart.display()
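For reference, the best-matching-unit lookup amounts to a nearest-neighbour search over the weight vectors of the grid. The NumPy sketch below illustrates that idea with random stand-in weights; `best_matching_unit` is a hypothetical helper written for this example and is not MiniSom's implementation.

```python
import numpy as np

# Stand-in for a trained 4x4 grid of 9-dimensional weight vectors (random, illustrative)
rng = np.random.default_rng(0)
weights = rng.random((4, 4, 9))

# A sample identical to the weights of unit (2, 3)
sample = weights[2, 3].copy()

def best_matching_unit(weights, sample):
    """Return the (row, col) of the unit whose weights are closest to the sample."""
    distances = np.linalg.norm(weights - sample, axis=-1)  # Euclidean distance per unit
    flat_index = np.argmin(distances)
    return tuple(int(i) for i in np.unravel_index(flat_index, distances.shape))

bmu = best_matching_unit(weights, sample)
print(bmu)
```

Because many board states can share one unit, points in the SOM chart overlap exactly on the grid; adding jitter or encoding the count per unit are possible ways to make that visible.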
In [298]:
# use low perplexity to prioritize local structures over global
# use high perplexity to search for global structures
tsne_coords = manifold.TSNE(perplexity=100).fit_transform(one_hot_data)
tsne_coords.shape
Out[298]:
(7616, 2)
In [299]:
df_tsne_coords = pd.DataFrame(tsne_coords, columns=['X','Y'])
df_tsne_coords.describe()
Out[299]:
X Y
count 7616.000000 7616.000000
mean 0.333834 -0.282749
std 28.785740 19.241974
min -55.087643 -43.582951
25% -24.864294 -15.417179
50% -0.414234 1.195042
75% 23.959500 14.568724
max 60.936981 38.795654
In [300]:
tsne_ttt = pd.concat([meta_data,df_tsne_coords], axis='columns')
tsne_ttt = pd.concat([tsne_ttt, data.reset_index(drop=True)], axis='columns')
#tsne_ttt.drop(columns=['step', 'player','result'], inplace=True)
tsne_ttt.head()
Out[300]:
player step result game_id state X Y pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22
0 1 0 1 1 start -26.452200 26.836319 0 0 0 0 0 0 0 0 1
1 -1 1 1 1 mid -4.331990 8.244121 0 0 0 0 -1 0 0 0 1
2 1 2 1 1 mid 32.401794 21.872248 1 0 0 0 -1 0 0 0 1
3 -1 3 1 1 mid 30.448349 -7.596972 1 0 -1 0 -1 0 0 0 1
4 1 4 1 1 mid 34.384506 -8.554271 1 0 -1 0 -1 1 0 0 1
In [301]:
alt.Chart(tsne_ttt).mark_circle(
    opacity=0.6
).encode(
    x='X',
    y='Y',
    color=alt.value("#4c78a8"),
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=500,
    height=400,
    title="Projected TTT States"
).interactive()
Out[301]:
In [302]:
alt.Chart(tsne_ttt).mark_point(
    opacity=0.6
).encode(
    x='X',
    y='Y',
    color='game_id:N',
    shape='state:N',
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=500,
    title="Projected TTT States"
).interactive()
Out[302]:
In [303]:
alt.Chart(tsne_ttt).mark_point(
    opacity=0.6
).encode(
    x='X',
    y='Y',
    color='state:N',
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).transform_filter((datum.state=='start') | (datum.state=='end')
).properties(
    width=700,
    height=500,
    title="Start & End States"
).interactive()
Out[303]:

Comments¶

  • Which features did you use? Why?
  • Which projection methods did you use? Why?
  • Why did you choose these hyperparameters?
  • Are there patterns in the global and the local structure?

TODO

Link States¶

Connect the states that belong together.

The states of a single solution should be connected to see the path from the start to the end state. How the points are connected is up to you, for example, with straight lines or splines.
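A path is simply the projected states of one game taken in move order. The sketch below shows that grouping on made-up coordinates, assuming the `game_id` and `step` columns of the dataset identify the game and the move number:

```python
import pandas as pd

# Illustrative projected states for two games (coordinates are made up)
states = pd.DataFrame({
    "game_id": [2, 1, 1, 2, 1],
    "step":    [1, 0, 2, 0, 1],
    "X":       [0.5, 0.0, 2.0, -1.0, 1.0],
    "Y":       [5.0, 0.0, 4.0, 10.0, 1.0],
})

# Sort by game and step so consecutive rows form a path from start to end
paths = states.sort_values(["game_id", "step"]).reset_index(drop=True)

# One ordered list of (X, Y) points per game
polylines = {
    game_id: list(zip(group["X"], group["Y"]))
    for game_id, group in paths.groupby("game_id")
}
print(polylines)
```

In Altair the same effect is obtained declaratively with `detail='game_id:N'` (one line per game) together with an `order` channel on a column that increases along the game.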

In [304]:
tsne_ttt = tsne_ttt.rename_axis('index').reset_index()
tsne_ttt.head()
Out[304]:
index player step result game_id state X Y pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22
0 0 1 0 1 1 start -26.452200 26.836319 0 0 0 0 0 0 0 0 1
1 1 -1 1 1 1 mid -4.331990 8.244121 0 0 0 0 -1 0 0 0 1
2 2 1 2 1 1 mid 32.401794 21.872248 1 0 0 0 -1 0 0 0 1
3 3 -1 3 1 1 mid 30.448349 -7.596972 1 0 -1 0 -1 0 0 0 1
4 4 1 4 1 1 mid 34.384506 -8.554271 1 0 -1 0 -1 1 0 0 1
In [305]:
alt.Chart(tsne_ttt).mark_line(
    opacity=0.6
).encode(
    x='X',
    y='Y',
    color='game_id:N', # color lines by game
    order='index:Q' # connect them in order (instead of position on x-axis)
).properties(
    width=700,
    height=500,
    title="Paths of solving attempts"
).interactive()
Out[305]:

Meta Data Encoding¶

Encode additional features in the visualization.

Use features of the source data and include them in the projection, e.g., by using color, opacity, different shapes, or line styles, etc.
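Numeric codes such as the `result` values make for hard-to-read legends. A small sketch of adding a readable label column before plotting; the label texts follow the mapping used in the statistics cells of this notebook, while the column name `result_label` is an assumption introduced here:

```python
import pandas as pd

# Mapping of result codes to labels, as used in the statistics cells of this notebook
RESULT_LABELS = {1: "Player X Wins", -1: "Player O Wins", 0: "Draw"}

# Illustrative rows only
toy = pd.DataFrame({"game_id": [1, 2, 3], "result": [1, -1, 0]})

# `result_label` is a hypothetical helper column for nicer legends and tooltips
toy["result_label"] = toy["result"].map(RESULT_LABELS)
print(toy)
```

With such a column, an encoding like `color='result_label:N'` would show the labels instead of -1, 0, and 1 in the legend.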

In [306]:
alt.Chart(tsne_ttt).mark_line(
    opacity=0.3
).encode(
    x='X',
    y='Y',
    detail='game_id:N', # draw one line per game, but ...
    color='result:N', # ... color the lines by game result
    order='index:Q',
    column='result:N'
).properties(
    title="Solving attempts by player win"
).interactive()
Out[306]:
In [307]:
alt.Chart(tsne_ttt).transform_filter(
    (datum.state == 'end') | (datum.state == 'start') # no intermediate states
).mark_point().encode(
    x='X',
    y='Y',
    shape='state:N',
    color='result:N',
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=500
).interactive()
Out[307]:
In [308]:
tsne_ttt = tsne_ttt[tsne_ttt['result'] != 0]

alt.Chart(tsne_ttt).mark_line(
    opacity=0.3
).encode(
    x='X',
    y='Y',
    detail='game_id:N', # draw one line per game, but ...
    color='result:N', # ... color the lines by game result
    order='index:Q',
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=700,
    title="Solving attempts by player win"
).interactive() + alt.Chart(tsne_ttt).transform_filter(
    (datum.state == 'end') | (datum.state == 'start') # no intermediate states
).mark_point().encode(
    x='X',
    y='Y',
    shape='state:N',
    color='result:N', # color the start/end markers by game result
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=500
).interactive()
Out[308]:
In [309]:
# Filter to get only the start state points, assuming these represent initial moves.
start_states = tsne_ttt[tsne_ttt['state'] == 'start']

# Create a base chart with lines for each game
line_chart = alt.Chart(tsne_ttt).mark_line(opacity=0.3).encode(
    x='X',
    y='Y',
    detail='game_id:N',
    color='result:N',
    order='index:Q'
).properties(
    width=700,
    height=500,
    title="Solving Attempts by Player Win"
).interactive()

# Add points to indicate start and end states
start_end_points = alt.Chart(tsne_ttt).transform_filter(
    (alt.datum.state == 'end') | (alt.datum.state == 'start')
).mark_point().encode(
    x='X',
    y='Y',
    shape='state:N',
    color='result:N'
).interactive()

# Highlight specific starting positions, e.g., from `start_states`
highlight_chart = alt.Chart(start_states).mark_point(
    size=100,
    color='red'
).encode(
    x='X',
    y='Y',
    tooltip=['game_id', 'state', 'result', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
)

# Combine all layers
final_chart = (line_chart + start_end_points + highlight_chart).properties(
    title="Highlighted Starting Positions in Tic-Tac-Toe States"
)

final_chart
Out[309]:

Distributions¶

The chance that the starting player X wins depends on the square X opens in. The following cells show the overall result distribution and the win rates of X and O per opening square.

In [310]:
end_states = tic_tac_toe[tic_tac_toe['state'] == 'end']

result_counts = end_states['result'].value_counts().rename_axis('result').reset_index(name='count')

total_games = result_counts['count'].sum()
result_counts['percentage'] = (result_counts['count'] / total_games) * 100

result_mapping = {1: "Player X Wins", -1: "Player O Wins", 0: "Draw"}
result_counts['description'] = result_counts['result'].map(result_mapping)

print("Win/Loss/Draw Statistics:")
print(result_counts[['description', 'count', 'percentage']])
Win/Loss/Draw Statistics:
     description  count  percentage
0  Player X Wins    579        57.9
1  Player O Wins    278        27.8
2           Draw    143        14.3
In [311]:
df = tic_tac_toe.copy()

# Filter to the first move of each game for Player X (result == 1 indicates X win)
x_start_data = df[(df['step'] == 0) & (df['player'] == 1)]

# Aggregate wins by starting position
winning_counts = []
for pos in ['pos00', 'pos01', 'pos02', 'pos10', 'pos11', 'pos12', 'pos20', 'pos21', 'pos22']:
    total_games = x_start_data[pos].value_counts().get(1, 0)  # Games where X starts in this position
    wins = x_start_data[(x_start_data[pos] == 1) & (x_start_data['result'] == 1)].shape[0]
    win_rate = wins / total_games if total_games > 0 else 0
    winning_counts.append({'position': pos, 'win_rate': win_rate})

# Convert to DataFrame
win_data = pd.DataFrame(winning_counts)


alt.Chart(win_data).mark_bar().encode(
    x=alt.X('position', sort=win_data['position'].tolist(), title='Starting Position'),
    y=alt.Y('win_rate', title='Win Rate for X Starting'),
    tooltip=['position', 'win_rate']
).properties(
    width=600,
    height=400,
    title="Win Rate for X by X's Starting Position"
).interactive()
Out[311]:
In [312]:
# Filter to the first move of each game where X starts
x_start_data = df[(df['step'] == 0) & (df['player'] == 1)]

# Aggregate O wins by X's starting position
o_winning_counts = []
for pos in ['pos00', 'pos01', 'pos02', 'pos10', 'pos11', 'pos12', 'pos20', 'pos21', 'pos22']:
    total_games = x_start_data[pos].value_counts().get(1, 0)  # Games where X starts in this position
    o_wins = x_start_data[(x_start_data[pos] == 1) & (x_start_data['result'] == -1)].shape[0]
    o_win_rate = o_wins / total_games if total_games > 0 else 0
    o_winning_counts.append({'position': pos, 'o_win_rate': o_win_rate})

# Convert to DataFrame
o_win_data = pd.DataFrame(o_winning_counts)

alt.Chart(o_win_data).mark_bar(color='orange').encode(
    x=alt.X('position', sort=o_win_data['position'].tolist(), title="X's Starting Position"),
    y=alt.Y('o_win_rate', title="Win Rate for O"),
    tooltip=['position', 'o_win_rate']
).properties(
    width=600,
    height=400,
    title="Win Rate for O by X's Starting Position"
).interactive()
Out[312]:

We can see that X has a higher chance of winning when it opens in pos00 or pos11.
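The per-position loops above can also be written as a single group-by. The sketch below uses four made-up first-move rows and assumes, as in the dataset, that the `step == 0` row of a game contains exactly one square with the value 1 (X's opening mark); the column name `opening` is introduced here for illustration.

```python
import pandas as pd

POS_COLS = ["pos00", "pos01", "pos02", "pos10", "pos11",
            "pos12", "pos20", "pos21", "pos22"]

# Made-up first-move rows (step 0): exactly one square holds X's mark (1)
first_moves = pd.DataFrame(0, index=range(4), columns=POS_COLS)
first_moves.loc[0, "pos00"] = 1
first_moves.loc[1, "pos00"] = 1
first_moves.loc[2, "pos11"] = 1
first_moves.loc[3, "pos22"] = 1
first_moves["result"] = [1, -1, 1, 0]

# idxmax over the position columns returns the name of the square holding the 1
first_moves["opening"] = first_moves[POS_COLS].idxmax(axis=1)

# Share of games won by X (result == 1) per opening square
x_win_rate = first_moves["result"].eq(1).groupby(first_moves["opening"]).mean()
print(x_win_rate)
```

Reporting the number of games per opening square next to the rate helps to judge whether differences between squares are larger than what random variation would produce.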

In [351]:
# meta_data = tic_tac_toe.iloc[:, 9:]
# pos00 = tic_tac_toe.iloc[:, 0:1]
# pos11 = tic_tac_toe.iloc[:, 4:5]
# poss = pd.concat([pos00, pos11], axis=1)
# filtered_data = pd.concat([pos00, pos11, meta_data], axis=1)

# filtered_data = tic_tac_toe[['pos00', 'pos11', 'result', 'game_id', 'player', 'state', 'step']]
# one_hot_data = pd.get_dummies(poss)
# one_hot_data.head()

meta_data = tic_tac_toe.iloc[:, 10:]
data = tic_tac_toe.iloc[:, :10]
one_hot_data = pd.get_dummies(data)
one_hot_data.head()
#meta_data.head()
Out[351]:
pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22 player
0 0 0 0 0 0 0 0 0 1 1
1 0 0 0 0 -1 0 0 0 1 -1
2 1 0 0 0 -1 0 0 0 1 1
3 1 0 -1 0 -1 0 0 0 1 -1
4 1 0 -1 0 -1 1 0 0 1 1
In [352]:
perplexity = 50
tsne_coords = manifold.TSNE(perplexity=perplexity).fit_transform(one_hot_data)
In [353]:
df_tsne_coords = pd.DataFrame(tsne_coords, columns=['X', 'Y'])

tsne_ttt = pd.concat([meta_data.reset_index(drop=True), df_tsne_coords], axis=1)
tsne_ttt = pd.concat([tsne_ttt, data.reset_index(drop=True)], axis=1)
tsne_ttt.head()
Out[353]:
step result game_id state X Y pos00 pos01 pos02 pos10 pos11 pos12 pos20 pos21 pos22 player
0 0 1 1 start 53.072693 -30.277599 0 0 0 0 0 0 0 0 1 1
1 1 1 1 mid -29.429220 -10.776819 0 0 0 0 -1 0 0 0 1 -1
2 2 1 1 mid 1.893498 36.526287 1 0 0 0 -1 0 0 0 1 1
3 3 1 1 mid -32.893929 19.521561 1 0 -1 0 -1 0 0 0 1 -1
4 4 1 1 mid 2.626123 11.310457 1 0 -1 0 -1 1 0 0 1 1
In [354]:
# Create the chart
chart = alt.Chart(tsne_ttt).mark_circle(opacity=0.6).encode(
    x='X',
    y='Y',
    color='result:N',  # Color by result for better clarity
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N', 'pos01:N', 'pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=500,
    height=400,
    title=f"t-SNE Projection with Perplexity {perplexity}"
).interactive()

# Display the chart
chart.display()
In [355]:
tsne_ttt = tsne_ttt[(tsne_ttt['state'] != 'mid')]
tsne_ttt = tsne_ttt[(tsne_ttt['pos11'] == 1)]
start_states = tsne_ttt[tsne_ttt['state'] == 'start']


alt.Chart(tsne_ttt).mark_line(
    opacity=0.3
).encode(
    x='X',
    y='Y',
    detail='game_id:N', # draw one line per game, but ...
    color='result:N', # ... color the lines by game result
    order='step:Q', # this frame has no 'index' column; step orders the states within a game
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N', 'pos01:N', 'pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=700,
    title="Solving attempts by player win"
).interactive() + alt.Chart(tsne_ttt).transform_filter(
    (datum.state == 'end') | (datum.state == 'start') # no intermediate states
).mark_point().encode(
    x='X',
    y='Y',
    shape='state:N',
    color='result:N', # color the start/end markers by game result
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N', 'pos01:N', 'pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=500
).interactive()
Out[355]:
In [371]:
end_states = tsne_ttt[tsne_ttt['state'] == 'end']

# Calculate win/loss/draw counts
result_counts = end_states['result'].value_counts().rename_axis('result').reset_index(name='count')

# Adding a column for percentage of each result
total_games = result_counts['count'].sum()
result_counts['percentage'] = (result_counts['count'] / total_games) * 100

# Mapping result values for clarity (optional)
result_mapping = {1: "Player X Wins", -1: "Player O Wins", 0: "Draw"}
result_counts['description'] = result_counts['result'].map(result_mapping)

print("Win/Loss/Draw Statistics:")
print(result_counts[['description', 'count', 'percentage']])
Win/Loss/Draw Statistics:
     description  count  percentage
0  Player X Wins    209   48.946136
1  Player O Wins    124   29.039813
2           Draw     94   22.014052
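Note that filtering start and end states on `pos11 == 1` keeps every end state in which X holds the centre, including games in which X took the centre only later. A sketch of selecting games by their opening move instead, on a made-up frame and assuming the `step == 0` row records X's first mark:

```python
import pandas as pd

# Two illustrative games: game 1 opens in the centre, game 2 takes it later
toy = pd.DataFrame({
    "game_id": [1, 1, 2, 2],
    "step":    [0, 4, 0, 4],
    "state":   ["start", "end", "start", "end"],
    "pos00":   [0, 0, 1, 1],
    "pos11":   [1, 1, 0, 1],
    "result":  [1, 1, -1, -1],
})

# State-level filter: also keeps the end state of game 2
state_filter = toy[toy["pos11"] == 1]

# Game-level filter: first find the games that opened in the centre ...
centre_openers = toy.loc[(toy["step"] == 0) & (toy["pos11"] == 1), "game_id"]
# ... then keep all states of exactly those games
by_opening = toy[toy["game_id"].isin(centre_openers)]

print(sorted(state_filter["game_id"].unique()))
print(sorted(by_opening["game_id"].unique()))
```

Statistics computed on the game-level selection describe games that started in the given square, which is what the comparison of opening positions asks for.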
In [381]:
meta_data = tic_tac_toe.iloc[:, 10:]
data = tic_tac_toe.iloc[:, :10]
one_hot_data = pd.get_dummies(data)
perplexity = 50
tsne_coords = manifold.TSNE(perplexity=perplexity).fit_transform(one_hot_data)
df_tsne_coords = pd.DataFrame(tsne_coords, columns=['X', 'Y'])
tsne_ttt = pd.concat([meta_data.reset_index(drop=True), df_tsne_coords], axis=1)
tsne_ttt = pd.concat([tsne_ttt, data.reset_index(drop=True)], axis=1)
In [382]:
# tsne_ttt = tsne_ttt[(tsne_ttt['state'] != 'mid')]
# tsne_ttt = tsne_ttt[(tsne_ttt['pos02'] == 1) | (tsne_ttt['pos12'] == 1)]
# start_states = tsne_ttt[tsne_ttt['state'] == 'start']
tsne_ttt = tsne_ttt[(tsne_ttt['state'] != 'mid')]
tsne_ttt = tsne_ttt[(tsne_ttt['pos01'] == 1) | (tsne_ttt['pos12'] == 1) | (tsne_ttt['pos21'] == 1)]
start_states = tsne_ttt[tsne_ttt['state'] == 'start']
In [383]:
alt.Chart(tsne_ttt).mark_line(
    opacity=0.3
).encode(
    x='X',
    y='Y',
    detail='game_id:N', # draw one line per game, but ...
    color='result:N', # ... color the lines by game result
    order='step:Q', # this frame has no 'index' column; step orders the states within a game
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N', 'pos01:N', 'pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=700,
    title="Solving attempts by player win"
).interactive() + alt.Chart(tsne_ttt).transform_filter(
    (datum.state == 'end') | (datum.state == 'start') # no intermediate states
).mark_point().encode(
    x='X',
    y='Y',
    shape='state:N',
    color='result:N', # color the start/end markers by game result
    tooltip=['game_id:N', 'state:N', 'result:N', 'player:N', 'step:Q', 'pos00:N', 'pos01:N', 'pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
).properties(
    width=700,
    height=500
).interactive()+ alt.Chart(start_states).mark_point(
    size=100,
    color='red'
).encode(
    x='X',
    y='Y',
    tooltip=['game_id', 'state', 'result', 'pos00:N','pos01:N','pos02:N','pos10:N','pos11:N','pos12:N','pos20:N','pos21:N','pos22:N']
)
Out[383]:
In [384]:
end_states = tsne_ttt[tsne_ttt['state'] == 'end']

# Calculate win/loss/draw counts
result_counts = end_states['result'].value_counts().rename_axis('result').reset_index(name='count')

# Adding a column for percentage of each result
total_games = result_counts['count'].sum()
result_counts['percentage'] = (result_counts['count'] / total_games) * 100

# Mapping result values for clarity (optional)
result_mapping = {1: "Player X Wins", -1: "Player O Wins", 0: "Draw"}
result_counts['description'] = result_counts['result'].map(result_mapping)

print("Win/Loss/Draw Statistics:")
print(result_counts[['description', 'count', 'percentage']])
Win/Loss/Draw Statistics:
     description  count  percentage
0  Player X Wins    449   53.452381
1  Player O Wins    248   29.523810
2           Draw    143   17.023810

Comments¶

  • Which features did you use? Why?
  • How are the features encoded?

TODO

Optional¶

Projection Space Explorer (click to reveal)

Projection Space Explorer

The Projection Space Explorer is a web application to plot and connect two-dimensional points. Metadata of the data points can be used to encode additional information into the projection, e.g., by using different shapes or colors. Further information:
  • Paper: https://jku-vds-lab.at/publications/2020_tiis_pathexplorer/
  • Repo: https://github.com/jku-vds-lab/projection-space-explorer/
  • Application Overview: https://jku-vds-lab.at/pse/

Data Format

How to format the data can be found in the Projection Space Explorer's README. Example data with three lines, with two colors (algo) and additional mark encoding (cp):
x y line cp algo
0.0 0 0 start 1
2.0 1 0 state 1
4.0 4 0 state 1
6.0 1 0 state 1
8.0 0 0 state 1
12.0 0 0 end 1
-1.0 10 1 start 2
0.5 5 1 state 2
2.0 3 1 state 2
3.5 0 1 state 2
5.0 3 1 state 2
6.5 5 1 state 2
8.0 10 1 end 2
3.0 6 2 start 2
2.0 7 2 end 2
Save the dataset to CSV, e.g. using pandas: `df.to_csv('data_path_explorer.csv', encoding='utf-8', index=False)`, and upload it in the Projection Space Explorer by clicking on `OPEN FILE` in the top left corner. ℹ You can also include your high-dimensional data and use it to adapt the visualization.
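A possible way to bring a projected frame into the column layout of the example above is sketched below. The mapping of `game_id` to `line`, `state` to `cp`, and `result` to `algo`, as well as renaming `mid` to `state`, are assumptions made for this illustration; the authoritative format is described in the Projection Space Explorer's README.

```python
import io
import pandas as pd

# Illustrative projected states of one game (coordinates are made up)
projected = pd.DataFrame({
    "X": [0.0, 2.0, 12.0],
    "Y": [0.0, 1.0, 0.0],
    "game_id": [1, 1, 1],
    "state": ["start", "mid", "end"],
    "result": [1, 1, 1],
})

# Assumed column mapping onto the x / y / line / cp / algo layout of the example
pse = projected.rename(columns={
    "X": "x", "Y": "y", "game_id": "line", "state": "cp", "result": "algo",
})
pse["cp"] = pse["cp"].replace({"mid": "state"})  # match the example's marker names
pse = pse[["x", "y", "line", "cp", "algo"]]

# Written to an in-memory buffer here; pass a file path to create the CSV on disk
buffer = io.StringIO()
pse.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```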

Results¶

You may add additional screenshots of the Projection Space Explorer.

In [321]:
# TODO

Interpretation¶

  • What can be seen in the projection(s)?
  • Was it what you expected? If not what did you expect?
  • Can you confirm prior hypotheses from the projection?
  • Did you get any unexpected insights?

TODO

Submission¶

When you’ve finished working on this assignment please download this notebook as HTML and add it to your repository in addition to the notebook file.